Turkish Discourse Bank: Porting a discourse annotation style to a morphologically rich language

Author

  • Deniz Zeyrek
Abstract

This paper describes the current state of the Turkish Discourse Bank, the first publicly available annotated discourse resource for Turkish. It describes the annotation methods and the challenges posed by annotating Turkish, a free-word-order language with rich morphology. It shows the usefulness of PDTB-style annotation but points out the need to extend this annotation style to meet the needs of the target language.

Similar resources

The Annotation Scheme of the Turkish Discourse Bank and an Evaluation of Inconsistent Annotations

In this paper, we report on the annotation procedures we developed for annotating the Turkish Discourse Bank (TDB), an effort that extends the Penn Discourse Tree Bank (PDTB) annotation style by using it for annotating Turkish discourse. After a brief introduction to the TDB, we describe the annotation cycle and the annotation scheme we developed, defining which parts of the scheme are an exten...

Cross-Domain and Cross-Language Porting of Shallow Parsing

English was the main focus of attention of the Natural Language Processing (NLP) community for years. As a result, there are significantly more annotated linguistic resources in English than in any other language. Consequently, data-driven tools for automatic text or speech processing are developed mainly for English. Developing similar corpora and tools for other languages is an important issu...

PDTB-style Discourse Annotation of Chinese Text

We describe a discourse annotation scheme for Chinese and report on the preliminary results. Our scheme, inspired by the Penn Discourse TreeBank (PDTB), adopts the lexically grounded approach; at the same time, it makes adaptations based on the linguistic and statistical characteristics of Chinese text. Annotation results show that these adaptations work well in practice. Our scheme, taken toge...

Annotating Subordinators in the Turkish Discourse Bank

In this paper we explain how we annotated subordinators in the Turkish Discourse Bank (TDB), an effort that started in 2007 and is still continuing. We introduce the project and describe some of the issues that were important in annotating three subordinators, namely karşın, rağmen and halde, all of which encode the coherence relation Contrast-Concession. We also describe the annotation tool.

TDB 1.1: Extensions on Turkish Discourse Bank

In this paper we present the recent developments on Turkish Discourse Bank (TDB). We first summarize the resource and present an evaluation. Then, we describe TDB 1.1, i.e. enrichments on 10% of the corpus (namely, added senses for explicit discourse connectives and new annotations for implicit relations, entity relations and alternative lexicalizations). We explain the method of annotation and

Journal:
  • D&D

Volume 4

Publication date: 2013